{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.011558,
     "end_time": "2021-04-20T21:06:30.038601",
     "exception": false,
     "start_time": "2021-04-20T21:06:30.027043",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Using Dagitty in the Analysis of Impact of 401(k) on Net Financial Wealth\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "_execution_state": "idle",
    "_uuid": "051d70d956493feee0c6d64651c6a088724dca2a",
    "papermill": {
     "duration": 25.408317,
     "end_time": "2021-04-20T21:06:55.456487",
     "exception": false,
     "start_time": "2021-04-20T21:06:30.048170",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "#install and load package\n",
    "install.packages(\"dagitty\")\n",
    "install.packages(\"ggdag\")\n",
    "library(dagitty)\n",
    "library(ggdag)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.010708,
     "end_time": "2021-04-20T21:06:55.479480",
     "exception": false,
     "start_time": "2021-04-20T21:06:55.468772",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Graphs for 401(K) Analsyis\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.010554,
     "end_time": "2021-04-20T21:06:55.500684",
     "exception": false,
     "start_time": "2021-04-20T21:06:55.490130",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Here we have\n",
    " # * $Y$ -- net financial assets;\n",
    " # * $X$ -- worker characteristics (income, family size, other retirement plans; see lecture notes for details);\n",
    " # * $F$ -- latent (unobserved) firm characteristics\n",
    " # * $D$ -- 401(K) eligibility, deterimined by $F$ and $X$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.010886,
     "end_time": "2021-04-20T21:06:55.522231",
     "exception": false,
     "start_time": "2021-04-20T21:06:55.511345",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# State one graph (where F determines X) and plot it\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.960286,
     "end_time": "2021-04-20T21:06:56.493094",
     "exception": false,
     "start_time": "2021-04-20T21:06:55.532808",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "#generate a DAGs and plot them\n",
    "\n",
    "G1 = dagitty('dag{\n",
    "Y [outcome,pos=\"4, 0\"]\n",
    "D [exposure,pos=\"0, 0\"]\n",
    "X [confounder, pos=\"2,-2\"]\n",
    "F [uobserved, pos=\"0, -1\"]\n",
    "D -> Y\n",
    "X -> D\n",
    "F -> X\n",
    "F -> D\n",
    "X -> Y}')\n",
    "\n",
    "\n",
    "ggdag(G1)+  theme_dag()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.011939,
     "end_time": "2021-04-20T21:06:56.517753",
     "exception": false,
     "start_time": "2021-04-20T21:06:56.505814",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# List minimal adjustment sets to identify causal effecs $D \\to Y$\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.047008,
     "end_time": "2021-04-20T21:06:56.576507",
     "exception": false,
     "start_time": "2021-04-20T21:06:56.529499",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "adjustmentSets( G1, \"D\", \"Y\",effect=\"total\" ) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.012126,
     "end_time": "2021-04-20T21:06:56.600815",
     "exception": false,
     "start_time": "2021-04-20T21:06:56.588689",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# What is the underlying principle? \n",
    "\n",
    "Here condition on X blocks backdoor paths from Y to D (Pearl).  Dagitty correctly finds X (and does many more correct decisions, when we consider more elaborate structures. Why do we want to consider more elaborate structures? The very empirical problem requires us to do so!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.012073,
     "end_time": "2021-04-20T21:06:56.625022",
     "exception": false,
     "start_time": "2021-04-20T21:06:56.612949",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Another Graph (wherere $X$ determine $F$):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.380125,
     "end_time": "2021-04-20T21:06:57.017337",
     "exception": false,
     "start_time": "2021-04-20T21:06:56.637212",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "#generate a couple of DAGs and plot them\n",
    "\n",
    "G2 = dagitty('dag{\n",
    "Y [outcome,pos=\"4, 0\"]\n",
    "D [exposure,pos=\"0, 0\"]\n",
    "X [confounder, pos=\"2,-2\"]\n",
    "F [uobserved, pos=\"0, -1\"]\n",
    "D -> Y\n",
    "X -> D\n",
    "X -> F\n",
    "F -> D\n",
    "X -> Y}')\n",
    "\n",
    "\n",
    "ggdag(G2)+  theme_dag()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.056051,
     "end_time": "2021-04-20T21:06:57.094387",
     "exception": false,
     "start_time": "2021-04-20T21:06:57.038336",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "adjustmentSets( G2, \"D\", \"Y\", effect=\"total\" )\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.014464,
     "end_time": "2021-04-20T21:06:57.130426",
     "exception": false,
     "start_time": "2021-04-20T21:06:57.115962",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# One more graph (encompassing previous ones), where (F, X) are jointly determined by latent factors $A$. We can allow in fact the whole triple (D, F, X) to be jointly determined by latent factors $A$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.014208,
     "end_time": "2021-04-20T21:06:57.159030",
     "exception": false,
     "start_time": "2021-04-20T21:06:57.144822",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "This is much more realistic graph to consider."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.393034,
     "end_time": "2021-04-20T21:06:57.567429",
     "exception": false,
     "start_time": "2021-04-20T21:06:57.174395",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "G3 = dagitty('dag{\n",
    "Y [outcome,pos=\"4, 0\"]\n",
    "D [exposure,pos=\"0, 0\"]\n",
    "X [confounder, pos=\"2,-2\"]\n",
    "F [unobserved, pos=\"0, -1\"]\n",
    "A [unobserved, pos=\"-1, -1\"]\n",
    "D -> Y\n",
    "X -> D\n",
    "F -> D\n",
    "A -> F\n",
    "A -> X\n",
    "A -> D\n",
    "X -> Y}')\n",
    "\n",
    "adjustmentSets( G3, \"D\", \"Y\", effect=\"total\"  ) \n",
    "\n",
    "ggdag(G3)+  theme_dag()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.016255,
     "end_time": "2021-04-20T21:06:57.599705",
     "exception": false,
     "start_time": "2021-04-20T21:06:57.583450",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Threat to Idenitification: What if $F$ also directly affects $Y$? (Note that there are no valid adjustment sets in this case)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.40878,
     "end_time": "2021-04-20T21:06:58.024720",
     "exception": false,
     "start_time": "2021-04-20T21:06:57.615940",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "G4 = dagitty('dag{\n",
    "Y [outcome,pos=\"4, 0\"]\n",
    "D [exposure,pos=\"0, 0\"]\n",
    "X [confounder, pos=\"2,-2\"]\n",
    "F [unobserved, pos=\"0, -1\"]\n",
    "A [unobserved, pos=\"-1, -1\"]\n",
    "D -> Y\n",
    "X -> D\n",
    "F -> D\n",
    "A -> F\n",
    "A -> X\n",
    "A -> D\n",
    "F -> Y\n",
    "X -> Y}')\n",
    "\n",
    "\n",
    "ggdag(G4)+  theme_dag()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.046811,
     "end_time": "2021-04-20T21:06:58.089616",
     "exception": false,
     "start_time": "2021-04-20T21:06:58.042805",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "adjustmentSets( G4, \"D\", \"Y\",effect=\"total\"  )\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.017351,
     "end_time": "2021-04-20T21:06:58.124909",
     "exception": false,
     "start_time": "2021-04-20T21:06:58.107558",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "Note that no output means that there is no valid adustment set (among observed variables)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.017324,
     "end_time": "2021-04-20T21:06:58.159612",
     "exception": false,
     "start_time": "2021-04-20T21:06:58.142288",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# How can F affect Y directly? Is it reasonable?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.017002,
     "end_time": "2021-04-20T21:06:58.193722",
     "exception": false,
     "start_time": "2021-04-20T21:06:58.176720",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# Introduce Match Amount $M$ (very important mediator, why mediator?). $M$ is not observed.  Luckily adjusting for $X$ still works if there is no $F \\to M$ arrow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.385562,
     "end_time": "2021-04-20T21:06:58.596529",
     "exception": false,
     "start_time": "2021-04-20T21:06:58.210967",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "G5 = dagitty('dag{\n",
    "Y [outcome,pos=\"4, 0\"]\n",
    "D [exposure,pos=\"0, 0\"]\n",
    "X [confounder, pos=\"2,-2\"]\n",
    "F [unobserved, pos=\"0, -1\"]\n",
    "A [unobserved, pos=\"-1, -1\"]\n",
    "M [unobserved, pos=\"2, -.5\"]\n",
    "D -> Y\n",
    "X -> D\n",
    "F -> D\n",
    "A -> F\n",
    "A -> X\n",
    "A -> D\n",
    "D -> M\n",
    "M -> Y\n",
    "X -> M\n",
    "X -> Y}')\n",
    "\n",
    "print( adjustmentSets( G5, \"D\", \"Y\",effect=\"total\"  ) )\n",
    "\n",
    "ggdag(G5)+  theme_dag()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.019066,
     "end_time": "2021-04-20T21:06:58.635211",
     "exception": false,
     "start_time": "2021-04-20T21:06:58.616145",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    "# If  there is $F \\to M$ arrow, then adjusting for $X$ is not sufficient."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "papermill": {
     "duration": 0.365274,
     "end_time": "2021-04-20T21:06:59.019538",
     "exception": false,
     "start_time": "2021-04-20T21:06:58.654264",
     "status": "completed"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "G6 = dagitty('dag{\n",
    "Y [outcome,pos=\"4, 0\"]\n",
    "D [exposure,pos=\"0, 0\"]\n",
    "X [confounder, pos=\"2,-2\"]\n",
    "F [unobserved, pos=\"0, -1\"]\n",
    "A [unobserved, pos=\"-1, -1\"]\n",
    "M [uobserved, pos=\"2, -.5\"]\n",
    "D -> Y\n",
    "X -> D\n",
    "F -> D\n",
    "A -> F\n",
    "A -> X\n",
    "D -> M\n",
    "F -> M\n",
    "A -> D\n",
    "M -> Y\n",
    "X -> M\n",
    "X -> Y}')\n",
    "\n",
    "print( adjustmentSets( G6, \"D\", \"Y\" ),effect=\"total\"  )\n",
    "\n",
    "ggdag(G6)+  theme_dag()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "papermill": {
     "duration": 0.020751,
     "end_time": "2021-04-20T21:06:59.062138",
     "exception": false,
     "start_time": "2021-04-20T21:06:59.041387",
     "status": "completed"
    },
    "tags": []
   },
   "source": [
    " # Question:\n",
    " \n",
    "Given the analysis above, do you find the adjustment for workers' characteristics a credible strategy to identify the causal (total effect) of 401 (k) elligibility on net financial wealth?\n",
    "\n",
    "     * If yes, click an \"upvote\" button at the top\n",
    "     * If no, please click an \"upvote\" button at the top"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "3.6.3"
  },
  "papermill": {
   "default_parameters": {},
   "duration": 32.278732,
   "end_time": "2021-04-20T21:06:59.193141",
   "environment_variables": {},
   "exception": null,
   "input_path": "__notebook__.ipynb",
   "output_path": "__notebook__.ipynb",
   "parameters": {},
   "start_time": "2021-04-20T21:06:26.914409",
   "version": "2.2.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
